Programming Massively Parallel Processors: A Hands-on Approach: The Heterogeneous Landscape: Why OpenCL?

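The era of homogeneous computing, in which a single central processing unit (CPU) handles every task, has reached its physical limits. Today we work in a heterogeneous landscape, where performance comes from an ensemble of specialized hardware: graphics processing units (GPUs) specialize in throughput, field-programmable gate arrays (FPGAs) excel at logic operations, and digital signal processors (DSPs) handle signal processing.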

1. The Shift to Heterogeneity

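Gains in modern computing performance no longer come from raising raw clock frequency; they come from integrating specialized accelerators. A heterogeneous system uses a host (typically a multicore CPU) to orchestrate tasks across different compute devices, each with its own memory and execution characteristics.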

2. The OpenCL Device Model

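OpenCL (Open Computing Language) provides a unified framework for managing this diversity. It treats each piece of hardware as a device, which is divided into compute units (CUs); each CU in turn contains one or more processing elements (PEs). Through the platform layer, developers can query device-specific properties at run time, such as clock frequency and memory size, so that the same code can adapt to hardware from different vendors.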
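To make the idea of adapting to queried properties concrete, here is a minimal sketch in plain C. It assumes the property values have already been read with clGetDeviceInfo (CL_DEVICE_MAX_COMPUTE_UNITS, CL_DEVICE_MAX_CLOCK_FREQUENCY, CL_DEVICE_GLOBAL_MEM_SIZE). The struct device_caps and the function pick_device are hypothetical names introduced for illustration, and the selection rule is only one possible policy.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of properties a host reads at run time.
   In a real program each field would be filled by clGetDeviceInfo:
     compute_units <- CL_DEVICE_MAX_COMPUTE_UNITS
     clock_mhz     <- CL_DEVICE_MAX_CLOCK_FREQUENCY
     global_mem    <- CL_DEVICE_GLOBAL_MEM_SIZE (bytes) */
typedef struct {
    const char *name;
    uint32_t    compute_units;
    uint32_t    clock_mhz;
    uint64_t    global_mem;
} device_caps;

/* Illustrative policy (an assumption, not part of OpenCL): among the
   devices whose global memory can hold needed_bytes, return the index
   of the one with the most compute units, or -1 if none qualifies. */
int pick_device(const device_caps *devs, size_t count, uint64_t needed_bytes)
{
    int best = -1;
    for (size_t i = 0; i < count; ++i) {
        if (devs[i].global_mem < needed_bytes)
            continue;  /* working set does not fit on this device */
        if (best < 0 ||
            devs[i].compute_units > devs[(size_t)best].compute_units)
            best = (int)i;
    }
    return best;
}
```

Because the decision depends only on values read at run time, the same host code can run unchanged on a machine with different devices; only the outcome of pick_device changes.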

3. The Trade-off Between Portability and Efficiency

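Although OpenCL offers code portability (write one kernel that runs on every vendor's hardware), its real strength is portable efficiency: it gives developers fine-grained control to optimize execution for the underlying architectural differences of each platform.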
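One common place where this control shows up is the choice of work-group size. The following is a rough sketch, assuming the device kind and the device limit (CL_DEVICE_MAX_WORK_GROUP_SIZE) have already been queried. The names device_kind, choose_local_size, and padded_global_size are hypothetical, and the preferred sizes are illustrative starting points rather than recommendations.

```c
#include <stddef.h>

/* Hypothetical coarse classification derived from CL_DEVICE_TYPE. */
typedef enum { KIND_CPU, KIND_GPU, KIND_OTHER } device_kind;

/* Illustrative starting points only (assumed values); real choices
   should come from profiling on the target hardware. */
static size_t preferred_local_size(device_kind kind)
{
    switch (kind) {
    case KIND_GPU: return 256;
    case KIND_CPU: return 16;
    default:       return 64;
    }
}

/* Clamp the preferred size to the limit the device reports
   (CL_DEVICE_MAX_WORK_GROUP_SIZE). Falls back to 1 if the limit is 0. */
size_t choose_local_size(device_kind kind, size_t device_max)
{
    size_t want = preferred_local_size(kind);
    if (device_max == 0)
        return 1;
    return want < device_max ? want : device_max;
}

/* Round the number of work-items up to a multiple of the local size.
   local must be nonzero. */
size_t padded_global_size(size_t n, size_t local)
{
    return ((n + local - 1) / local) * local;
}
```

The two results would be passed as the local and global work sizes to clEnqueueNDRangeKernel. OpenCL 1.x requires the global size to be a multiple of the local size, which is why the sketch pads it; the kernel then needs a bounds check such as `if (get_global_id(0) < n)` so that the extra work-items do nothing. The kernel source stays the same on every device, and only these launch parameters change.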

QUESTION 1

Read the “OpenCL Platform Layer” section of the OpenCL specification. Compare the platform querying API functions with what you have learned in CUDA.

CUDA and OpenCL both use a single function to find devices without vendor platforms.

OpenCL requires a hierarchical query (Platform then Device), while CUDA queries devices directly.

OpenCL cannot query device capabilities at runtime, whereas CUDA can.

OpenCL platforms are equivalent to CUDA streaming multiprocessors.

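✅ Correct answer: OpenCL requires a hierarchical query (Platform then Device), while CUDA queries devices directly.

CUDA targets a single vendor, so hardware discovery is simpler: cudaGetDeviceCount reports how many devices are present. OpenCL must cope with a heterogeneous landscape, so the host first calls clGetPlatformIDs to find the installed vendor platforms (for example NVIDIA or Intel) and then calls clGetDeviceIDs on each platform to list its devices.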

QUESTION 2

What is the primary role of the 'Host' in a heterogeneous system?

To perform all high-throughput mathematical calculations.

To act as the conductor, orchestrating tasks across specialized devices.

To replace the GPU for graphics rendering.

To provide power only to the FPGA.

QUESTION 3

How does OpenCL abstract hardware units like a Streaming Multiprocessor (SM)?

As a Processing Element (PE).

As a Compute Unit (CU).

As a Memory Bank.

As a Platform Identifier.

QUESTION 4

Why is 'Portable Efficiency' valued over simple 'Code Portability' in OpenCL?

Because code that runs on everything automatically runs at peak speed.

Because it allows developers to tune code for specific architectural nuances while keeping the source portable.

Because it removes the need for kernel optimization.

Because OpenCL only supports CPUs.

QUESTION 5

Which OpenCL constant is used to query for any hardware device type (CPU, GPU, etc.)?

CL_DEVICE_TYPE_GPU

CL_DEVICE_TYPE_ALL

CL_DEVICE_VENDOR_ONLY

CL_PLATFORM_ALL

✅ Correct answer: CL_DEVICE_TYPE_ALL

CL_DEVICE_TYPE_ALL allows the host to discover all supported compute devices in the heterogeneous system.
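The reason CL_DEVICE_TYPE_ALL matches everything is that device types are bit flags. The sketch below is a simplified model of that matching, not the OpenCL implementation: the DEMO_ constants mirror the bit values defined in cl.h but are renamed so the block compiles without the OpenCL headers, and count_matching is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the cl.h bit values (cl_device_type is a 64-bit
   bitfield). Renamed with a DEMO_ prefix for this standalone sketch. */
#define DEMO_TYPE_CPU         ((uint64_t)1 << 1)
#define DEMO_TYPE_GPU         ((uint64_t)1 << 2)
#define DEMO_TYPE_ACCELERATOR ((uint64_t)1 << 3)
#define DEMO_TYPE_ALL         ((uint64_t)0xFFFFFFFF)

/* Simplified model: a device matches when its type shares at least
   one bit with the requested mask. */
size_t count_matching(const uint64_t *device_types, size_t count,
                      uint64_t mask)
{
    size_t hits = 0;
    for (size_t i = 0; i < count; ++i)
        if (device_types[i] & mask)
            ++hits;
    return hits;
}
```

Under this model, a mask of DEMO_TYPE_GPU selects only GPUs, DEMO_TYPE_CPU | DEMO_TYPE_GPU selects both kinds, and DEMO_TYPE_ALL has every type bit set, so it selects every device.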